Genetic data sharing and artificial intelligence in the era of personalized medicine based on a cross‐sectional analysis of the Saudi human genome program

The success of the Saudi Human Genome Program (SHGP), one of the top ten genomic programs worldwide, is highly dependent on the Saudi population embracing the concept of participating in genetic testing. However, genetic data sharing and artificial intelligence (AI) in genomics are critical public issues in medical care and scientific research. The present study aimed to examine the awareness, knowledge, and attitude of Saudi society towards the SHGP, the sharing and privacy of genetic data resulting from the SHGP, and the role of AI in genetic data analysis and regulations. Results of a questionnaire survey with 804 respondents revealed moderate awareness and attitude towards the SHGP and minimal knowledge regarding its benefits and applications. Respondents demonstrated a low level of knowledge regarding the privacy of genetic data. A generally positive attitude was found towards the outcomes of the SHGP and genetic data sharing for medical and scientific research. The highest level of knowledge was detected regarding AI use in genetic data analysis and privacy regulation. We recommend that the SHGP's regulators launch awareness campaigns and educational programs to improve public awareness and knowledge regarding the SHGP's benefits and applications. Furthermore, we propose a strategy that will facilitate genetic data sharing between institutions and advance Personalized Medicine in the diagnosis and treatment of genetic diseases.

www.nature.com/scientificreports/

Subject recruitment. The electronic version of the questionnaire began with an introduction to the study's aims, including the importance of voluntary participation, and a consent statement. The questionnaire was distributed via different social media platforms in Saudi Arabia, including Twitter, WhatsApp and Telegram. Saudis are very active on these platforms; for example, they ranked seventh in the world in terms of Twitter users (12.7 million). All Saudi citizens aged 18 years or older were targeted to participate in the study. More than 844 responses were received, and the exclusion criteria were (a) non-Saudi nationality, (b) age under 18 years, and (c) incomplete responses.

Study instruments.
The questionnaire was designed and validated, and its electronic version was created using Google Forms. The validated version of the survey consisted of six sections: (1) social and demographic information, including age, gender, educational level, and nationality; (2) participants' awareness of genetic diseases (6 items); (3) participants' awareness of the SHGP (8 items); (4) Saudi citizens' knowledge of and attitude toward genetic data privacy in the SHGP (9 items); (5) attitude toward the use of AI in genomics and in the privacy management of genetic data (6 items); and (6) attitude toward sharing genetic data in scientific research (2 items).
Statistical analysis. All responses were imported into Excel spreadsheets and categorized for descriptive and statistical analyses. The statistical software programs SAS (version 9.4) and SPSS (version 25) were used to perform t-tests and multivariate ANOVA on several variables, including the level of public knowledge and awareness regarding the SHGP, genetic data privacy and sharing, and AI use. Statistical significance was set at a P value of less than 0.05 for all analyses.
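As an illustration of the kind of group comparison described above, the following minimal sketch computes a one-way ANOVA F statistic and its degrees of freedom in plain Python. It is not the SAS or SPSS procedure used in the study; the function name `one_way_anova`, the group labels, and the scores are hypothetical and serve only to show how a value reported as F (df1, df2) is obtained.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical attitude scores grouped by educational level (not study data).
scores_by_education = {
    "secondary": [4, 5, 4, 3],
    "bachelor": [4, 5, 5, 4],
    "postgraduate": [5, 6, 5, 6],
}

f_stat, df_b, df_w = one_way_anova(list(scores_by_education.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The P value would then be read from the F distribution with these degrees of freedom, a step that statistical packages perform automatically.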
Excluding responses. We excluded 40 respondents who chose "non-Saudi" because we could not confirm whether they lived in Saudi Arabia (SA).
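The exclusion criteria stated for this survey (non-Saudi nationality, age under 18 years, incomplete responses) can be expressed as a simple filter. The sketch below is illustrative only: the `Response` record, its field names, and the sample rows are assumptions and do not reflect the actual structure of the exported questionnaire data.

```python
from dataclasses import dataclass


@dataclass
class Response:
    nationality: str  # hypothetical field: "Saudi" or "non-Saudi"
    age: int          # hypothetical field: age in years
    complete: bool    # hypothetical field: all items answered


def is_eligible(response):
    """Apply the three exclusion criteria described in the text."""
    return (
        response.nationality == "Saudi"
        and response.age >= 18
        and response.complete
    )


# Made-up rows for demonstration, not study data.
responses = [
    Response("Saudi", 25, True),
    Response("non-Saudi", 31, True),
    Response("Saudi", 17, True),
    Response("Saudi", 40, True),
    Response("Saudi", 52, False),
]

included = [r for r in responses if is_eligible(r)]
print(f"kept {len(included)} of {len(responses)} responses")
```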
Informed consent statement. Informed consent was obtained from all subjects involved in the study.

Awareness of genetic diseases among participants. The SHGP was launched to study the causes of the high prevalence of genetic disorders and detect rare inherited diseases among Saudi citizens. Therefore, we investigated the level of public awareness about different aspects of genetic diseases in SA as shown in Table 2 and Supplementary Fig. 2. Approximately 74.3% of study participants were aware of the high prevalence of genetic diseases among Saudis. Almost all participants (93.8%) knew genetic diseases negatively impact affected individuals and their families. Most participants (90.2%) were aware of the role of consanguinity in the increase of genetic disease incidence. Interestingly, only 19.8% of participants had undergone genetic testing, but nearly all participants (95.6%) had a positive attitude and high awareness of the importance of pre-marital screening in reducing the prevalence of inherited diseases. Further analysis revealed that overall awareness of genetic diseases was significantly higher in females than males (p = 0.0094) as shown in Table 3.

Awareness and attitude toward the SHGP. Despite the massive media campaign about the SHGP launched in 2021, only 40.5% of study respondents had heard of the SHGP, as shown in Table 4. Moreover, 73.8% of participants were not aware of the benefits and applications of the SHGP. The vast majority of participants (82.1%) assumed that the SHGP would document the first genetic map of Saudi citizens. Approximately 86.3% of respondents chose "yes" for the possible contribution of the SHGP to gene therapy development. Furthermore, 87.2% of participants had a positive attitude toward the contribution of the SHGP to the localization of genomic techniques and genetic research. Only 4.6% of participants were among the sample donors in the program, but 68.8% of participants were willing to participate. More than 80% of participants were optimistic about the contribu-

Knowledge and attitude toward genetic data privacy of the SHGP. Nine items in the survey questionnaire focused on examining the level of knowledge and attitudes toward genetic data privacy of the SHGP ( Interestingly, there was uncertainty regarding the level of knowledge of the importance of the privacy and security of genetic data, as the responses were divided between the lowest level (28.4%), medium level (20.6%), and highest level (26.2%); the remainder were not sure. A majority of participants (79.7%) expressed the highest level of support for obtaining the patient's consent before sharing their genetic data. Similarly, the highest level of support was reported regarding the need for a general policy for the privacy of genetic data (78.1%). Importantly, most participants (75.4%) showed the highest level of positive attitude toward the importance of organizing seminars to introduce knowledge related to the privacy and security of genetic data. Positively, most participants supported genetic data sharing in scientific and medical research and establishing a national policy to protect genetic data privacy when shared between Saudi institutions (Fig. 1).
Attitudes toward the use of AI in the privacy of genetic data. As massive genetic data are generated and become big data, rapid and accurate AI-based analysis is required to provide clinical reports for health diagnoses or other related tasks in research or medical fields. Thus, we investigated the attitude and opinions of Saudi society about the involvement of AI in the privacy of genetic data and its role in SHGP data analysis (Table 6). Surprisingly, 92.8% of participants agreed that AI could be used to analyse genetic data. Furthermore, most participants (80.6%) agreed that AI could contribute to solving genetic disorders. A vast majority of participants (90.7%) agreed that AI technologies could provide solutions to ensure the privacy of genetic data. Most participants (88.8%) agreed with employing AI in managing the privacy of genetic data. However, the participants were divided regarding whether AI use poses a threat to the privacy of genetic data, as 41.2% chose "agree" and 58.8% chose "do not agree". Positively, 90.7% of participants agreed that AI could be used in the SHGP.
The statistical analysis showed that the attitude toward using AI in the SHGP differed significantly by educational level (F (3, 801) = 4.68, p = 0.0030). Participants with a postgraduate degree (p = 0.0110, M = 5.043) had a more positive attitude toward using AI in the privacy of genetic data and the SHGP than those with a bachelor's degree (M = 4.574). Moreover, there was a statistically significant difference by marital status (F (3, 801) = 7.28, P < 0.0001). Married participants (P < 0.0001, M = 5.040) had a more positive attitude toward the use of AI in the privacy of genetic data and the SHGP than single participants (M = 4.574). Furthermore, specific age groups differed significantly (F (4, 800) = 4.35, P = 0.0018). Participants aged 38 to less than 48 years (P = 0.0100, M = 5.055) had a more positive attitude toward the use of AI in the privacy of genetic data and the SHGP than those aged 18 to less than 28 years (M = 4.666).

Discussion
The SHGP was recently established to detect and study the causes of genetic disorders. In this study, we found that most participants were aware of the high prevalence of genetic diseases among Saudis (Table 2). Most participants considered consanguinity a factor in genetic diseases. Nearly all participants had a positive attitude and sufficient awareness of pre-marital screening in reducing the prevalence of inherited diseases. These results are consistent with our previous study and with other reports showing that the Saudi community has a high level of awareness toward genetic testing [28][29][30][31]. One possible reason for these positive findings is that in 2002, the Saudi government passed a law requiring pre-marital genetic testing 32. Interestingly, the results of our study also revealed that females had significantly higher awareness of genetic diseases than males. We then examined the awareness and attitude toward the SHGP, and found inadequate awareness about the SHGP and its benefits and applications (Table 4). Thus, there is a need for greater efforts to educate people about the SHGP and the human genome in general. Furthermore, we documented that a high percentage of participants assumed that the SHGP would establish the first genetic map of the Saudi community. There was a positive attitude among the respondents regarding the contribution of the SHGP to gene therapy and the localization of genomic techniques. Moreover, the responses showed encouraging results (68.8%) in willingness to participate in the SHGP sample collection initiative. The participants were generally optimistic about the SHGP outcomes, potentially lowering the prevalence of genetic diseases and their negative impacts. In addition, the analysis revealed that knowledge and attitudes concerning the SHGP did not differ significantly by marital status or age. However, there was a significant association between educational attainment and awareness level, as people with postgraduate degrees were more aware of the SHGP than those with bachelor's degrees.
Regarding the level of knowledge and attitude toward genetic data privacy and management of the SHGP data, an insufficient level of knowledge was reported (Table 5). The participants did not have enough knowledge regarding the process of preserving and managing genetic data, and nearly half did not know the institutions responsible for storing the genetic data in SA. Regarding genetic data privacy and security, uncertainty and a low level of knowledge were detected among respondents. A high rate of concern about patient privacy was reported, as most participants called for informed consent before sharing their genetic data. Similarly, the highest level of support was detected for applying a general policy to genetic data privacy. Importantly, most responses exhibited the highest level of positive attitude toward the importance of organizing seminars to introduce knowledge related to the privacy and security of genetic data. We noticed some contradictory responses in a few questions related to genetic data privacy and genetic data sharing. For instance, 43.8% of participants did not know the institutions responsible for storing Saudi genetic data, while 33.6% of them chose the highest level of knowledge regarding the management of genetic data with high privacy in SA (Table 5). These contradictions could be a result of a low level of knowledge and awareness about these issues among the participants.
Positively, most participants supported genetic data sharing in scientific and medical research and the establishment of a national policy to protect the privacy of genetic data when it is shared between Saudi institutions (Fig. 1). We found that the public supports genetic data sharing if their privacy and personal information are secured. Consistent with this, a study conducted in Riyadh, SA, showed that 78.4% of the participants were in favour of building a government-managed database of hereditary diseases 28. However, several reports have shown that the public is concerned about data misuse and about being identified and stigmatized with genetic diseases 23,24,26,27. For example, surveys about Personalized Medicine conducted in Pennsylvania (the United States) and Bavaria (Germany) showed that most participants were worried about genetic data misuse 33. Nevertheless, the general public trusts researchers in the hope of finding cures for complex diseases. Based on these findings, we propose a strategy for sharing the SHGP data that ensures the privacy and security of genetic data (Fig. 2). The sharing of genetic data will broaden opportunities for researchers and medical practitioners to accelerate gene therapy discovery, improve the diagnosis of genetic diseases and develop personalized medicine for patients 27,34,35. Consistent with this idea, other investigators have called for a national genomic data-sharing policy in SA that allows data to be freely shared among institutions to enhance biomarker discovery and computational biology analysis, improving the treatment of genetic disease complications 34,35. A lack of a genetic data sharing policy will limit the use, access, and analysis of the SHGP data. A genetic data sharing policy will regulate the privacy of genetic data if it is shared with a third party, as well as how it is shared. The policy should also regulate how genetic data are collected, stored, and provided in accordance with the law.
We further investigated public attitudes toward the use of AI in the analysis of genetic data and privacy regulation in the SHGP (Table 6). Participants reported the most positive attitude toward AI use in genetic data analysis. Furthermore, most participants trusted the ability of AI to solve genetic disorders. In terms of the privacy of genetic data, a vast majority of responses indicated that AI technologies could ensure and manage privacy. However, the participants had divided opinions regarding the threat of AI use in privacy regulations. Almost all participants had positive attitudes toward the use of AI in the SHGP. Furthermore, our statistical analysis revealed that the attitude toward using AI in the SHGP differed significantly by educational level (F (3, 801) = 4.68, p = 0.0030). Participants with a postgraduate degree (p = 0.0110, M = 5.043) had a more positive attitude toward employing AI in the privacy of genetic data and the SHGP than those with a bachelor's degree (M = 4.574). Moreover, people aged 38-48 (P = 0.0100, M = 5.055) had a more positive attitude about the use of AI in the privacy of genetic data.
Surprisingly, participants showed a higher level of positive attitude and knowledge toward AI applications than toward the SHGP and its benefits. This result could be because AI is currently a prominent topic in SA. More specifically, the government has established the Saudi Data and Artificial Intelligence Authority (SDAIA). In addition, several media campaigns have presented information about AI and its applications 36.

[Figure: Proposed strategy for genetic data sharing of the SHGP. First, a national policy for genetic data sharing should be established. Second, advanced technologies should be used to ensure genetic data security and privacy. Third, laws governing genetic data regulation must be enforced. Finally, national awareness campaigns and educational programs should be launched among clinicians, physicians, researchers, and the general public.]
Despite some concerns about AI use in health care and genomic data, such as inaccuracies, discrimination, and bias in the database, AI algorithms will revolutionize genomics and proteomics data analysis, improving precision medicine in genetic disease diagnosis and treatment 23,37. AI algorithms, more specifically deep learning-based algorithms, are currently being employed in clinical diagnosis and in the analysis of complex and large-scale genomic databases. However, AI-based algorithms may require very large training databases to improve genomic data analysis and drug discovery. Therefore, genetic data sharing will improve the use of AI in the SHGP and Personalized Medicine. Furthermore, AI and privacy technologies such as cryptography, differential privacy and other approaches could provide solutions for genetic data sharing 23,24,38,39.
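To make the reference to differential privacy more concrete, the following minimal sketch shows the standard Laplace mechanism applied to a counting query, for example the number of carriers of a variant in a cohort. It is a generic textbook construction, not a method used or proposed by the SHGP; the function names, the cohort count, and the epsilon values are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean
    `scale` follows a Laplace distribution centred on zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    """Release a count with epsilon-differential privacy.

    Adding or removing one individual changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed so the demonstration is repeatable
carriers = 120          # hypothetical number of variant carriers

# Smaller epsilon means stronger privacy and noisier released counts.
for epsilon in (0.1, 1.0):
    released = private_count(carriers, epsilon, rng)
    print(f"epsilon={epsilon}: released count = {released:.2f}")
```

In such a scheme an institution would share only the noisy count, trading some accuracy for a formal limit on what the released value reveals about any single participant.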
In the current study, we analysed and assessed Saudi public awareness, knowledge, and attitudes toward the SHGP, genetic data privacy and the role of AI in the management of privacy and the analysis of genetic data. To the best of our knowledge, this study is the first population-based survey of Saudi public awareness and knowledge toward the SHGP. We anticipate that the outcome of this study can help decision-makers involved in SHGP management and genetic data regulation plan public communication strategically, implement SHGP findings, and establish a national genetic data sharing policy.

Conclusion
This study provides insights regarding Saudi society's awareness, knowledge, and attitude towards the SHGP, the sharing and privacy of genetic data resulting from the SHGP, and the role of AI in managing privacy and analysing genetic data. We reported moderate awareness and attitude towards the SHGP and minimal knowledge regarding its benefits and applications. In addition, a low level of knowledge was observed regarding sharing and privacy of genetic data. A generally positive attitude was found towards the outcomes of the SHGP and genetic data sharing for medical and scientific research. Furthermore, the highest level of knowledge was detected regarding AI use in genetic analysis and privacy regulations. We identified gender, marital status and educational level as important factors in public awareness and knowledge of the SHGP. Furthermore, we proposed a strategy for genetic data sharing in Saudi Arabia. We recommend that awareness campaigns and educational programs be launched by institutions that manage the SHGP to improve public awareness and fill the knowledge gaps regarding these issues.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.